ABSTRACT

Hemolytik (http://crdd.osdd.net/raghava/hemolytik/) 
is a manually curated database of experimentally 
determined hemolytic and non-hemolytic peptides. 
Data were compiled from a large number of pub- 
lished research articles and various databases like 
Antimicrobial Peptide Database, Collection of Anti- 
microbial Peptides, Dragon Antimicrobial Peptide 
Database and Swiss-Prot. The current release of 
Hemolytik database contains ~3000 entries that
include ~2000 unique peptides whose hemolytic
activities were evaluated on erythrocytes isolated 
from as many as 17 different sources. Each entry in
Hemolytik provides comprehensive information
about a peptide, such as its name, sequence, origin,
reported function, chirality, conformation
(linear or cyclic) and end modifications, as well as
details pertaining to its hemolytic activity. In
addition, the tertiary structure of each peptide has been
predicted, and secondary structure states have been
assigned. To assist the scientific community, a
user-friendly interface has been developed with
various tools for data searching and analysis. We
hope that Hemolytik will be useful for researchers
working on the design of therapeutic peptides.

INTRODUCTION 

During the past decade, there has been a renewed interest
in peptides and peptide-based therapeutics and diagnostics.
Peptide-based therapeutics have several advantages over
small molecules and antibody-based therapeutics, including
high specificity, high tissue penetration ability and ease
of modification (1,2). Despite this tremendous therapeutic
potential, only a limited number of peptides have so far been
commercialized as drugs. The development of peptide-based
drugs is a very challenging and time-consuming process.
High toxicity and poor stability are the two key
concerns while developing peptide-based drugs (1,3).
Most peptides, despite their high therapeutic potential,
do not reach clinical trials because of their toxicity
(hemolytic activity) and because of difficulties in their
manufacturing.

Toxicity of therapeutic peptides against normal eukary- 
otic cells is usually first checked by testing their hemolytic 
activity against red blood cells (RBCs) because RBCs 
provide a model system, which is physiologically 
relevant and significantly easier to work with as 
compared with other systems such as liposomes. In 
general, peptides having high hemolytic activity are not 
suitable for therapeutic use. Thus, it is essential to
reduce the hemolytic potency of a peptide without
compromising its therapeutic activity. Presently, there is
a paucity of information that can help in designing
therapeutic peptides that lack hemolytic activity. A
systematic analysis of hemolytic and non-hemolytic
peptide sequences is necessary to delineate the features
of peptides responsible for their hemolytic activity,
which can then be taken into account while designing a
therapeutic peptide.

Over the past decades, a plethora of articles describing
therapeutic peptides (e.g. antimicrobial, anticancer,
antiviral and cell-penetrating peptides) and their hemolytic
potential has been published. This information is scattered in
the literature and is thus difficult to access. To the best of
the authors' knowledge, no effort has so far been made to
collect and compile the information pertinent to hemolytic
peptides and their potencies. In the present study, for
the first time, a systematic attempt has been made to
collect this scattered information on experimentally
determined hemolytic and non-hemolytic peptides. This
information is compiled in the form of a database called
Hemolytik. We hope that this database will be useful for
researchers working in the field of peptide therapeutics.

SYSTEM AND METHODS 

Data acquisition 

Data were manually collected from published literature 
and various databases, including the Antimicrobial 
Peptide Database (4), Dragon Antimicrobial Peptide 
Database (5), Collection of Antimicrobial Peptides (6) 
and Swiss-Prot (7). Hemolytic peptides were included in
the database if their activity had been evaluated
experimentally using a hemolysis assay. Specific searches were
carried out to collect all the research articles describing
experimentally determined hemolytic peptides. In
PubMed, an advanced search using keywords such as
'hemolytic peptides' (in abstract/title) resulted in ~900
research articles. Similar searches were also carried out
in Swiss-Prot and other related databases using the
keywords 'hemolytic/hemolysis'. All research articles
were downloaded and compiled systematically.
Comprehensive information, including peptide sequences,
chirality, end modifications, hemolytic potency and source of
RBCs used in the assay, was extracted and compiled
for experimentally determined hemolytic and
non-hemolytic peptides.

We have made multiple entries for a peptide if the
same peptide has been tested against RBCs from different
sources (e.g. human, sheep, rabbit). Therefore, the
total number of entries in Hemolytik is 2970, whereas the
numbers of unique hemolytic and non-hemolytic sequences
are 1750 and 295, respectively.
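
The difference between the number of entries and the number of unique sequences can be illustrated with a minimal Python sketch. The records below are invented placeholders, not actual Hemolytik entries, and the tuple layout is an assumption made only for this example; the point is that one sequence tested on several RBC sources contributes several entries but only one unique sequence.

```python
# Hypothetical records: (peptide sequence, RBC source, reported as hemolytic?).
# Placeholder sequences only; none of these rows comes from the database.
entries = [
    ("ACDEFGHIK", "human", True),
    ("ACDEFGHIK", "sheep", True),
    ("ACDEFGHIK", "rabbit", True),
    ("LMNPQRSTVW", "human", False),
]

# Every (peptide, RBC source) test is its own entry.
total_entries = len(entries)

# Unique sequences are counted once, however many sources they were tested on.
unique_hemolytic = {seq for seq, _, hemolytic in entries if hemolytic}
unique_non_hemolytic = {seq for seq, _, hemolytic in entries if not hemolytic}

print(total_entries, len(unique_hemolytic), len(unique_non_hemolytic))
```

Here four entries collapse to one unique hemolytic and one unique non-hemolytic sequence, which mirrors why the entry count exceeds the unique-sequence counts.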

Database architecture and web interface 

After the collection and compilation of all the information,
the database was built on an Apache HTTP Server with a
MySQL server. MySQL is a relational database
management system (RDBMS), and it works at the
backend. It provides commands to store data in and retrieve
data from the database. HTML, PHP and JavaScript
were used to develop the front-end web interface. All
common gateway interface and database interfacing
scripts were written in the PHP and Perl programming
languages. Because Apache, MySQL and PHP are
platform-independent and open-source software, they
were preferred for developing the database. The architecture
of the Hemolytik database is shown in Figure 1.

Organization of data 

In the Hemolytik database, each peptide is assigned a unique
entry number, and detailed information about each
peptide is provided. Each entry contains the following
major fields: (i) name of peptide, (ii) amino acid sequence
of peptide, (iii) chirality/conformation of peptide (L/D
and linear/cyclic), (iv) details of modified amino acids
(e.g. ornithine, β-alanine), (v) function/activity (e.g.
antimicrobial) of peptide, (vi) source of peptide (e.g. snake
venom), (vii) hemolytic activity, (viii) modifications at
N- and C-termini of peptide (e.g. acetylation/amidation)
and (ix) source of RBCs used in the assay (e.g. human,
mouse).
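
To make the field list concrete, the sketch below shows one way such an entry could be stored and retrieved in a relational table. It is an illustration under stated assumptions, not the actual Hemolytik schema: the table name, column names and the inserted row are hypothetical, and SQLite (through Python's standard sqlite3 module) stands in for MySQL so that the example runs without a database server.

```python
import sqlite3

# SQLite stands in for MySQL so the sketch is self-contained.
# Table and column names are hypothetical, chosen to mirror fields (i)-(ix).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE peptide_entry (
        entry_id           INTEGER PRIMARY KEY,  -- unique entry number
        name               TEXT,                 -- (i)
        sequence           TEXT NOT NULL,        -- (ii)
        chirality          TEXT,                 -- (iii) L, D or Mix
        conformation       TEXT,                 -- (iii) linear or cyclic
        modified_residues  TEXT,                 -- (iv)
        peptide_function   TEXT,                 -- (v)
        peptide_source     TEXT,                 -- (vi)
        hemolytic_activity TEXT,                 -- (vii)
        n_terminus         TEXT,                 -- (viii)
        c_terminus         TEXT,                 -- (viii)
        rbc_source         TEXT                  -- (ix)
    )
    """
)

# Placeholder row; the values are invented for illustration.
conn.execute(
    "INSERT INTO peptide_entry (name, sequence, chirality, conformation, "
    "peptide_function, peptide_source, hemolytic_activity, n_terminus, "
    "c_terminus, rbc_source) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    ("example peptide", "ACDEFGHIK", "L", "linear", "antimicrobial",
     "synthetic", "placeholder activity text", "free", "amidation", "human"),
)

# Retrieve all entries tested on human RBCs.
rows = conn.execute(
    "SELECT entry_id, sequence, rbc_source FROM peptide_entry "
    "WHERE rbc_source = ?",
    ("human",),
).fetchall()
print(rows)
```

Keeping the RBC source as a column of the entry, rather than of the peptide, matches the convention that the same peptide tested on different RBCs is stored as separate entries.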

One of the unique features of the Hemolytik database is
that it provides structural information on peptides.
Structure-function analysis of various therapeutic
peptides suggests that a defined secondary structure,
with an amphipathic distribution of hydrophobic and
hydrophilic residues, is a requisite feature for their
membranolytic activity (8,9). Therefore, understanding
the tertiary structure of these peptides is of considerable
interest. However, the structures of most of the hemolytic
peptides and their analogs have not been determined and
thus are not available in the Protein Data Bank (PDB). To
provide structural information on peptides, we have
predicted the tertiary structures of all the peptides having
natural amino acids using the software PEPstr (10), which is a
state-of-the-art method for predicting the structure of
bioactive peptides. For a given peptide sequence, PEPstr first
predicts beta-turn types, followed by PSIPRED (11)-predicted
secondary structure states, and integrates this information
with energy minimization and molecular
dynamics using AMBER 11 (12) to predict the tertiary
structure of the peptide. Using the DSSP software (13), eight
secondary structure states have also been assigned for each
peptide.
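
As an illustration of how an eight-state assignment might be summarized, the following Python sketch computes the fraction of residues in each DSSP state. The one-letter codes are the standard DSSP codes; the example string is hypothetical, and representing the unassigned (blank) DSSP state as "-" is an assumption about how the assignment was exported. This is not the procedure used to build Hemolytik.

```python
from collections import Counter

# The eight DSSP secondary structure codes. DSSP writes a blank for residues
# with no assigned state; it is represented as "-" here (an assumption).
DSSP_STATES = {
    "H": "alpha helix",
    "B": "isolated beta bridge",
    "E": "extended strand",
    "G": "3-10 helix",
    "I": "pi helix",
    "T": "hydrogen-bonded turn",
    "S": "bend",
    "-": "loop or irregular",
}


def state_fractions(dssp_string):
    """Return the fraction of residues in each DSSP state that occurs."""
    unknown = set(dssp_string) - set(DSSP_STATES)
    if unknown:
        raise ValueError(f"unexpected DSSP codes: {sorted(unknown)}")
    counts = Counter(dssp_string)
    return {state: n / len(dssp_string) for state, n in counts.items()}


# Hypothetical assignment for a 10-residue peptide.
fractions = state_fractions("-HHHHHHTT-")
for state, fraction in fractions.items():
    print(f"{DSSP_STATES[state]}: {fraction:.0%}")
```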

In the Hemolytik database, most of the derivatives of
existing therapeutic peptides contain non-natural/
modified amino acids (e.g. D-amino acids, ornithine,
β-alanine) and non-proteinogenic moieties (e.g. 6-aminohexanoic
acid, p-hydroxycinnamic acid). To the best of
the authors' knowledge, there is no web server that can predict
the tertiary structures of peptides having non-natural and
modified amino acids. Therefore, we have predicted the
structures of peptides having natural amino acids, D-amino
acids, end modifications like acetylation/amidation as
well as peptides having ornithine as a modified amino acid
by extending the use of AMBER 11 (12) in the PEPstr
algorithm. To change the stereochemistry of a residue to the
D-form, the flip command of AMBER 11 was used. For
non-natural residues like ornithine (14), the force field
library for that residue was used in AMBER 11. In our
database, we maintain the tertiary structures of peptides in
PDB format.

Data statistics 

The current release of the Hemolytik database contains 2970
entries (Figure 1) based on hemolysis assays carried out on
RBCs isolated from different organisms. During data
curation and compilation, we observed that many
peptides show differential hemolytic activity on
RBCs isolated from different sources (e.g. human,
sheep, rat). Therefore, we have incorporated information
on as many as 17 RBC sources (Figure 2). In most of the
entries, hemolytic potencies of peptides were evaluated on
human RBCs (2246), followed by sheep (167) and rat
RBCs (148, Figure 2). We have made multiple entries for a
single peptide if the same peptide was found to have been
tested either on different RBCs (e.g. human and rabbit
RBCs) or at different concentrations (e.g. at 25 µM and
50 µM). Because the information related to experimentally
determined non-hemolytic peptides is equally important,
we have extracted the information related to non-hemolytic
peptides, and 319 entries of non-hemolytic peptides
(295 unique) were compiled. In addition, there are a few
entries where the same peptides have been reported to be
hemolytic as well as non-hemolytic. This is probably due
to different experimental conditions (e.g. type of RBC
source, incubation temperature, concentration of
peptides). As we do not want to lose any information, we
have incorporated this information as well.

Figure 1. Architecture of Hemolytik database.

The peptides in the Hemolytik database belong to diverse
classes of therapeutic peptides and have different
functions, including antimicrobial, antiparasitic, anticancer,
cell-penetrating and antiviral activities. However, most of
the entries in Hemolytik belong to antimicrobial peptides,
followed by anticancer peptides. We have also categorized
peptides based on their conformation (linear/cyclic) and
chirality, i.e. L/D/Mix (both L and D). Hemolytik
contains 2669 entries of linear peptides, while 301 entries
have information about cyclic peptides. We have compiled
2583 entries of peptides containing only L-amino acids, 47
entries of peptides consisting of only D-amino acids and 340
entries of peptides having both L- and D-amino acids. In
addition, we have also collected and compiled information
related to chemical modifications in peptides, such as
modified amino acids (α-aminoisobutyric acid, norleucine,
ornithine, etc.), non-proteinogenic moieties
(tetrahydroisoquinoline carboxylic acid,
octahydro-1H-indole-2-carboxylic acid, γ-aminobutyric acid)
and modified peptide chemistry (reduced amide bond
ψ[CH2NH]). A total of 237 peptide entries provide
information about hemolytic peptides having
chemical modifications.
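
The conformation and chirality breakdowns each partition the full set of 2970 entries, which can be verified with a short Python check. The counts are those reported in the text; the variable names are introduced only for this sketch.

```python
# Counts as reported in the text.
TOTAL_ENTRIES = 2970
by_conformation = {"linear": 2669, "cyclic": 301}
by_chirality = {"L": 2583, "D": 47, "Mix": 340}

# Each categorization should account for every entry exactly once.
assert sum(by_conformation.values()) == TOTAL_ENTRIES
assert sum(by_chirality.values()) == TOTAL_ENTRIES

# Share of each category among all entries.
for label, count in {**by_conformation, **by_chirality}.items():
    print(f"{label}: {count} entries ({100 * count / TOTAL_ENTRIES:.1f}%)")
```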



Integration of web tools 

Many user-friendly tools have been integrated in 
Hemolytik for extraction and analysis of peptides. 
Following are the main tools provided with the 
Hemolytik database. 

Search facility 

We have implemented four search tools: basic
search, conditional search, peptide search and SMILES
search. The basic search option allows users to perform a
search on any field of the database, such as PubMed ID,
peptide name, peptide sequence, chirality of peptide,
origin, nature of peptide, RBC source, etc. Users can
display any or all fields for the selected records.
The conditional search facility allows users to perform
multiple queries at a time. Under the conditional search
option, users can perform a complex search by adding any
number of queries and can select the conditions
(e.g. AND and OR) between queries. In peptide search,
users can search for a peptide sequence in the database. There
are two options for peptide search: (i) containing peptide,
which searches for a user-defined peptide sequence in
Hemolytik, so that users can check whether their peptide of
interest is present (partially or completely) in any hemolytic
peptide; and (ii) exact search, which allows users to
search for hemolytic peptides that are identical to their
peptide. The SMILES search option provides a facility to
search the SMILES notation of a given peptide against
the Hemolytik peptide database in SMILES format.
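
The logic of the peptide search and conditional search options can be sketched in a few lines of Python. This is only an illustration: the records and field names are hypothetical, the "containing" option is interpreted here as a plain substring match, and the web server's actual implementation is not described in the text.

```python
# Hypothetical in-memory records with placeholder sequences.
RECORDS = [
    {"id": 1, "sequence": "ACDEFGHIK", "rbc_source": "human", "chirality": "L"},
    {"id": 2, "sequence": "DEFG", "rbc_source": "sheep", "chirality": "L"},
    {"id": 3, "sequence": "LMNPQRSTVW", "rbc_source": "human", "chirality": "D"},
]


def containing_search(query, records):
    """Records whose sequence contains the query as a substring."""
    query = query.upper()
    return [r for r in records if query in r["sequence"]]


def exact_search(query, records):
    """Records whose sequence is identical to the query."""
    query = query.upper()
    return [r for r in records if r["sequence"] == query]


def conditional_search(records, conditions, operator="AND"):
    """Combine (field, value) equality conditions with AND or OR."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    combine = all if operator == "AND" else any
    return [
        r for r in records
        if combine(r[field] == value for field, value in conditions)
    ]


contained_ids = [r["id"] for r in containing_search("defg", RECORDS)]
exact_ids = [r["id"] for r in exact_search("DEFG", RECORDS)]
conditions = [("rbc_source", "human"), ("chirality", "L")]
and_ids = [r["id"] for r in conditional_search(RECORDS, conditions, "AND")]
or_ids = [r["id"] for r in conditional_search(RECORDS, conditions, "OR")]
print(contained_ids, exact_ids, and_ids, or_ids)
```

The containing search returns every record in which the query occurs, whereas the exact search returns only identical sequences; AND narrows a conditional search and OR widens it.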

Categorization 

Categorization is a powerful browsing facility
that allows users to browse data on four major fields:
(i) source of RBCs used in the assay, (ii) chirality
of peptides, (iii) function of peptides and (iv) length
of peptides. In Hemolytik, we have covered 17 different
sources of RBCs on which peptides have been tested.
These peptides have shown different hemolytic potencies
on different RBCs. Using this field, users can browse
peptides tested on RBCs isolated from a particular
source. In the chirality field, three types of peptides have
been compiled: (i) peptides having all L-amino acids, (ii)
peptides having all D-amino acids and (iii) peptides having
both L- and D-amino acids (mixed). Users can browse this
information using the chirality field. In addition, users can
browse peptides on the basis of their function. We have
covered peptides with >15 types of functions, including
anticancer, antimicrobial, cell-penetrating, antiviral and
antiparasitic. Peptides can also be browsed based on their
length. In Hemolytik, peptide length varies from 2 to
104 amino acids. However, most of the peptides have a
length between 11 and 15 amino acids.
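
Browsing by length amounts to grouping peptides into length ranges. The Python sketch below shows one way to do this; the bin width of five residues is an assumption chosen so that one bin coincides with the 11-15 range mentioned above, and the sequences are placeholders rather than database entries.

```python
from collections import Counter


def length_bin(sequence, width=5):
    """Return a label such as '11-15' for the bin containing len(sequence)."""
    start = ((len(sequence) - 1) // width) * width + 1
    return f"{start}-{start + width - 1}"


# Placeholder sequences of length 2, 11, 15 and 16.
sequences = ["AC", "ACDEFGHIKLM", "ACDEFGHIKLMNPQR", "ACDEFGHIKLMNPQRS"]
histogram = Counter(length_bin(s) for s in sequences)
print(dict(histogram))
```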

BLAST search 

This search tool enables users to perform a similarity-based
search against hemolytic and non-hemolytic
peptides. Users can submit peptide sequences in FASTA
format and select different parameters, like weight matrix
and expectation value, for performing a BLAST search (15).
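
For readers unfamiliar with the FASTA input format, the following Python sketch parses FASTA-formatted text into header and sequence pairs. It only illustrates the format (a ">" header line followed by one or more sequence lines); the function name and example records are hypothetical, and nothing here reflects how the Hemolytik server itself reads submissions.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # Sequence lines may be wrapped; join them and normalize case.
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = """>query1 hypothetical peptide
ACDEFGHIK
>query2
lmnpq
rstvw
"""
parsed = parse_fasta(example)
print(parsed)
```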

Smith-Waterman algorithm 

Because the Smith-Waterman algorithm (16) performs
similarity search more effectively in the case of small
peptides, we have integrated this tool. This option allows users to
search for hemolytic or non-hemolytic peptides in the
database that are similar to their peptides. Users can
search multiple peptide sequences at a time by submitting
sequences in FASTA format.
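
A minimal Python sketch of Smith-Waterman local alignment is given below to show the idea behind this tool. It uses a flat match/mismatch score and a linear gap penalty, which are illustrative assumptions; a production search would normally use an amino acid substitution matrix, and the parameters used by the Hemolytik server are not stated in the text.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score and the two aligned substrings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; scores never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    aligned_a, aligned_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + step:
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


# Placeholder sequences: the second is an exact fragment of the first.
result = smith_waterman("KKACDEFGHKK", "ACDEFGH")
print(result)
```

Because the algorithm scores only the best-matching region, a short query can be matched against a longer peptide without being penalized for the unaligned flanks, which is why it suits small peptides.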

Alignment 

This tool allows users to align their sequences with the
peptides in the Hemolytik database. A user can input
multiple FASTA sequences in the sequence box and
peptide IDs of the Hemolytik database in the ID box, and
get aligned sequences. Users also have the option to upload
a file in PDB format and align its structure with the
structure of the peptide whose ID is provided in the box.

Mapping 

This tool allows users to map hemolytic peptides onto their
peptide sequence. It allows the user to run a sub-search
and a super-search. In sub-search, a given peptide is mapped
against all peptides of the Hemolytik database, whereas
super-search returns similar peptides of our database against a
protein sequence given as query. It also allows users to
submit a protein or polypeptide sequence to identify
segments that are identical to hemolytic peptides.
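
The identification of segments identical to known peptides can be sketched as exact string matching of each database peptide against the query protein. The Python example below is an illustration only: the identifiers and sequences are hypothetical, and the server may use a different matching procedure.

```python
def map_peptides(protein, peptides):
    """Find every occurrence of each known peptide inside a protein sequence.

    Returns (peptide_id, start, end) tuples with 1-based inclusive positions.
    """
    hits = []
    for peptide_id, peptide in peptides.items():
        start = protein.find(peptide)
        while start != -1:
            hits.append((peptide_id, start + 1, start + len(peptide)))
            # Continue one residue further so overlapping hits are also found.
            start = protein.find(peptide, start + 1)
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Hypothetical identifiers and placeholder sequences.
known_peptides = {"P1": "DEFG", "P2": "KLM", "P3": "WWW"}
protein = "ACDEFGHIKLMNPDEFG"
hits = map_peptides(protein, known_peptides)
print(hits)
```

Each hit reports which peptide was found and where it lies in the query, so repeated segments of the same peptide appear as separate hits.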



DISCUSSION 

The field of therapeutic peptides is growing very rapidly 
due to substantial technological progress (1,17). The literature
on therapeutic peptides is growing rapidly (18), and
therefore, in the past few years alone, many comprehensive
databases of various therapeutic peptides, including
antimicrobial peptides (5,6), cell-penetrating peptides
(19), tumor-homing peptides (20) and quorum-sensing
peptides (21), have been developed. In addition, an
interesting and useful resource of blood-brain barrier
peptides, Brainpeps (22), has also been developed
recently. Brainpeps not only provides information of 
blood-brain barrier peptides but also gives comprehensive 
information related to many experimental techniques used 
to study the peptide penetration with various parameters. 
These databases have demonstrated the increasing popu- 
larity of peptides as therapeutic candidates. However, 
despite several advantages, peptides often suffer from 
poor in vivo stability and high toxicity toward eukaryotic 
cells (18), which is often judged by their hemolytic 
activities. Therefore, most of the research on therapeutic
peptides is currently focused on designing peptide
derivatives having low or no hemolytic activity while
retaining their therapeutic activity (23,24). As a result, a
large amount of data on hemolytic peptides and their
derivatives, along with their hemolytic potencies, has been
reported in the past. This information may be very
useful for researchers/scientists to design therapeutic
peptides without hemolytic potencies. However, no
attempt has been made to catalog this information, and
thus, it is currently difficult to access this useful
information. Hemolytik is the first comprehensive database of its
kind that provides experimental information related to
hemolytic peptides and their potencies.

Besides the collection of hemolytic peptides and their
potencies, various web-based tools have been integrated
in the Hemolytik database, which facilitate various types of
analysis. Users can make the best use of Hemolytik in the
following ways: (i) while designing therapeutic peptides,
users can check whether their peptides of interest are
already reported to be hemolytic or not; (ii) because
Hemolytik also contains experimentally determined
non-hemolytic peptides, users can select the least hemolytic or
non-hemolytic peptides for further therapeutic
applications; and (iii) users can exploit the predicted structural
information of hemolytic peptides for docking or molecular
dynamics of the peptide-membrane complex.

The accurate prediction of therapeutic activities of
peptides may expedite peptide-based drug discovery. In
this context, the Hemolytik database will be useful for
developing novel in silico prediction and design
methods for hemolytic or non-hemolytic peptides. In the
recent past, a few prediction and quantitative
structure-activity relationship models for various therapeutic
peptides, including antimicrobial peptides (25-27),
cell-penetrating peptides (28-31) and antioxidant peptides
(32,33), have been developed, which will help to
reduce the costs and efforts involved in laboratory
screening. The optimized structures and SMILES of
hemolytic peptides available in the Hemolytik database
can also be used to develop quantitative structure-activity
relationship models for rapid screening of therapeutic
peptides with less hemolytic activity. In conclusion,
Hemolytik would be very useful to the scientific community
working in the field of therapeutic peptides.

UPDATE OF HEMOLYTIK 

The web interface provides an option for users to submit a
new entry of experimentally determined hemolytic and
non-hemolytic peptides by filling in an HTML form. To
maintain a high standard of quality, we will confirm
the validity of each new entry before including it in Hemolytik.
In addition, our team will also update this database
regularly.

LIMITATIONS AND FUTURE DEVELOPMENTS 

Besides a collection of hemolytic and non-hemolytic
peptide sequences, the Hemolytik database also provides
structural information on peptides. In Hemolytik, there
are many peptides that have modified amino acids (e.g.
4-nitrophenylalanine)/non-proteinogenic moieties. We
have made an attempt to predict the tertiary structure of
peptides with non-natural (D-form) and modified amino
acids (ornithine). However, a limitation of this database
is that the structures of a few modified peptides (with complex
modifications) could not be predicted. In the future, as
more and more published force field libraries of individual
modified residues become available, the prediction of
structures of peptides having such residues will be feasible.